METHOD AND APPARATUS FOR ENCODING HYBRID 
INTRA-INTER CODED BLOCKS 

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application Serial No.
60/497,816 (Attorney Docket No. PU030258), filed August 26, 2003 and entitled
"METHOD AND APPARATUS FOR HYBRID MACROBLOCK MODES FOR VIDEO
CODECS", which is incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

The invention relates generally to digital video CODECs, and more particularly
to the hybrid use of both intra and inter coding for macroblocks.

BACKGROUND OF THE INVENTION

A video encoder can be used to encode one or more frames of an image
sequence into digital information. This digital information may then be transmitted to a
receiver, where the image or the image sequence can then be reconstructed
(decoded). The transmission channel itself may include any of a number of possible
channels for transmission. For example, the transmission channel might be a radio
channel or other means for wireless broadcast, coaxial Cable Television cable, a
GSM mobile phone TDMA channel, a fixed line telephone link, or the Internet. This
list of transmission means is only illustrative and is by no means meant to be
all-inclusive.

Various international standards have been agreed upon for video encoding
and transmission. In general, a standard provides rules for compressing and
encoding data relating to frames of an image. These rules provide a way of
compressing and encoding image data to transmit less data than the viewing camera
originally provided about the image. This reduced volume of data then requires less
channel bandwidth for transmission. A receiver can reconstruct (or decode) the
image from the transmitted data if it knows the rules that the transmitter used to
perform the compression and encoding. The H.264 standard minimizes redundant
transmission of parts of the image, by using motion compensated prediction of
macroblocks from previous frames.

Video compression architectures and standards, such as MPEG-2 and
JVT/H.264/MPEG-4 Part 10/AVC, encode a macroblock using only either an
intraframe ("intra") coding or an interframe ("inter") coding method for the encoding of
each macroblock. For interframe motion estimation/compensation, a video frame to
be encoded is partitioned into non-overlapping rectangular, or most commonly,
square blocks of pixels. For each of these macroblocks, the best matching
macroblock is searched from a reference frame in a predetermined search window
according to a predetermined matching error criterion. Then the matched macroblock
is used to predict the current macroblock, and the prediction error macroblock is
further processed and transmitted to the decoder. The relative shifts in the horizontal
and vertical directions of the reference macroblock with respect to the original
macroblock are grouped and referred to as the motion vector (MV) of the original
macroblock, which is also transmitted to the decoder. The main aim of motion
estimation is to predict a macroblock such that the difference macroblock obtained
from taking a difference of the reference and current macroblocks produces the
lowest number of bits in encoding.
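
By way of illustration only, the block-matching search described above may be sketched as follows. The sketch assumes that frames are plain lists of rows of integer samples, that the matching error criterion is the sum of absolute differences (SAD), and that the search is exhaustive over a square window; the function names `sad` and `full_search` are hypothetical and are not taken from any standard. A practical encoder would typically minimize an estimate of the coded bits rather than SAD alone.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustively test every displacement within +/- search_range and
    return (motion_vector, cost) for the lowest-SAD candidate."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The returned displacement plays the role of the motion vector (MV), and the sample-wise difference between the current block and the matched block is the prediction error that would be further processed.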

For intra coding, a macroblock (MB) or a sub-macroblock within a picture is
predicted using spatial prediction methods. For inter coding, temporal prediction
methods (i.e., motion estimation/compensation) are used. Inter prediction
(coding) methods are usually more efficient than intra coding methods. In the existing
architectures/standards, specific picture or slice types are defined which specify or
restrict the intra or inter MB types that can be encoded for transmission to a decoder.
In intra (I) pictures or slices, only intra MB types can be encoded, while in Predictive
(P) and Bi-predictive (B) pictures or slices, both intra and inter MB types may be
encoded.

An I-picture or I-slice contains only intra coded macroblocks and does not
use temporal prediction. The pixel values of the current macroblock are first spatially
predicted from their neighboring pixel values. The residual information is then
transformed using an NxN transform (e.g., a 4x4 or 8x8 DCT transform) and then
quantized.

B-pictures or B-slices introduce the concept of bi-predictive (or, in a
generalization, multiple-prediction) inter coded macroblock types, where a macroblock
(MB) or sub-block is predicted by two (or more) interframe predictions. Due to
bi-prediction, B pictures usually tend to be more efficient in coding than both I and P
pictures. A P-picture or B-picture may contain different slice types, and macroblocks
encoded by different methods. A slice can be of I (Intra), P (Predicted), B
(Bi-predicted), SP (Switching P), and SI (Switching I) type.

Intra and inter prediction methods have been used separately within video
coding architectures and standards such as MPEG-2 and H.264. For intra coded
macroblocks, available spatial samples within the same frame or picture are used to
predict current macroblocks, while in inter prediction, temporal samples within other
pictures or other frames are instead used. In the H.264 standard, two different intra
coding modes exist: a 4x4 intra mode which performs the prediction process for every
4x4 block within a macroblock; and a 16x16 intra mode, for which the prediction is
performed for the entire macroblock in a single step.

Each frame of a video sequence is divided into so-called "macroblocks",
which comprise luminance (Y) information and associated (potentially spatially
sub-sampled, depending upon the color space) chrominance (U, V) information.
Macroblocks are formed by representing a region of 16x16 image pixels in the
original image as four 8x8 blocks of luminance (luma) information, each luminance
block comprising an 8x8 array of luminance (Y) values; and two spatially
corresponding chrominance components (U and V) which are sub-sampled by a
factor of two in the horizontal and vertical directions to yield corresponding arrays of
8x8 chrominance (U, V) values.

In 16x16 spatial (intra) prediction mode the luma values of an entire 16x16
macroblock are predicted from the pixels around the edges of the MB. In the 16x16
intra prediction mode, the 33 neighboring samples immediately above and/or to the
left of the 16x16 luma block are used for the prediction of the current macroblock, and
only 4 modes (0 vertical, 1 horizontal, 2 DC, and 3 plane prediction) are used.

FIG. 1 illustrates the intraframe (intra) prediction sampling method for the 4x4
intra mode in the H.264 standard of the related art. The samples of a 4x4 luma block
110 to be intra encoded, containing pixels "a" through "p" in FIG. 1, are predicted using
nearby pixels "A" through "M" in FIG. 1 from neighboring blocks. In the decoder,
samples "A" through "M" from previous macroblocks of the same picture/frame
typically have been already decoded and can then be used for prediction of the current
macroblock 110.

FIG. 2 illustrates, for the 4x4 luma block 110 of FIG. 1, the nine intra prediction
modes labeled 0 through 8. Mode 2 is the "DC-prediction". The other
modes (0, 1, 3, 4, 5, 6, 7, and 8) represent directions of predictions as indicated by the
arrows in FIG. 2.
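
As a minimal sketch only, three of the nine 4x4 intra modes (0 vertical, 1 horizontal, 2 DC) may be written as shown below. The sketch assumes the usual H.264 labeling in which samples "A" through "D" lie directly above the block and samples "I" through "L" lie directly to its left, and it assumes all of those neighbors are available; the function name `predict_4x4` is hypothetical. The remaining directional modes are omitted.

```python
def predict_4x4(mode, above, left):
    """Return a 4x4 prediction (list of rows) for intra modes 0, 1 and 2.

    above: the four reconstructed samples directly above the block (A..D).
    left:  the four reconstructed samples directly to its left (I..L).
    """
    if mode == 0:
        # Vertical: each column repeats the sample above it.
        return [list(above) for _ in range(4)]
    if mode == 1:
        # Horizontal: each row repeats the sample to its left.
        return [[left[y]] * 4 for y in range(4)]
    if mode == 2:
        # DC: rounded mean of the eight neighboring samples.
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")
```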

The intra macroblock types that are defined in the H.264 standard are as
follows:

Table 1 - Intra Macroblock types

mb_type  Name of mb_type  MbPartPredMode  Intra16x16  CodedBlock     CodedBlock
                          (mb_type, 0)    PredMode    PatternChroma  PatternLuma
0        I_4x4            Intra_4x4       NA          NA             NA
1        I_16x16_0_0_0    Intra_16x16     0           0              0
2        I_16x16_1_0_0    Intra_16x16     1           0              0
3        I_16x16_2_0_0    Intra_16x16     2           0              0
4        I_16x16_3_0_0    Intra_16x16     3           0              0
5        I_16x16_0_1_0    Intra_16x16     0           1              0
6        I_16x16_1_1_0    Intra_16x16     1           1              0
7        I_16x16_2_1_0    Intra_16x16     2           1              0
8        I_16x16_3_1_0    Intra_16x16     3           1              0
9        I_16x16_0_2_0    Intra_16x16     0           2              0
10       I_16x16_1_2_0    Intra_16x16     1           2              0
11       I_16x16_2_2_0    Intra_16x16     2           2              0
12       I_16x16_3_2_0    Intra_16x16     3           2              0
13       I_16x16_0_0_1    Intra_16x16     0           0              15
14       I_16x16_1_0_1    Intra_16x16     1           0              15
15       I_16x16_2_0_1    Intra_16x16     2           0              15
16       I_16x16_3_0_1    Intra_16x16     3           0              15
17       I_16x16_0_1_1    Intra_16x16     0           1              15
18       I_16x16_1_1_1    Intra_16x16     1           1              15
19       I_16x16_2_1_1    Intra_16x16     2           1              15
20       I_16x16_3_1_1    Intra_16x16     3           1              15
21       I_16x16_0_2_1    Intra_16x16     0           2              15
22       I_16x16_1_2_1    Intra_16x16     1           2              15
23       I_16x16_2_2_1    Intra_16x16     2           2              15
24       I_16x16_3_2_1    Intra_16x16     3           2              15
25       I_PCM            NA              NA          NA             NA


FIG. 3 depicts a current macroblock 310 to be inter coded in a P-frame or
P-slice using temporal prediction, instead of spatial prediction, by estimating a motion
vector (i.e., MV, Motion Vector) between the best match (BM) among the blocks of
two pictures (301 and 302). In inter coding, a current block 310 in the current frame
301 is predicted from a displaced matching block (BM) in the previous frame 302.
Every inter coded block (e.g., 310) is associated with a set of motion parameters
(motion vectors and a reference index ref_idx), which provide to the decoder a
corresponding location within the reference picture (302) associated with ref_idx from
which all pixels in the block 310 can be predicted. The difference between the
original block (310) and its prediction (BM) is compressed and transmitted along with
the displacement motion vectors (MV). Motion can be estimated independently for
either a 16x16 macroblock or any of its sub-macroblock partitions: 16x8, 8x16, 8x8, 8x4,
4x8, 4x4. An 8x8 macroblock partition is known as a sub-macroblock (or subblock).
Hereinafter, the term "block" generally refers to a rectangular group of adjacent pixels
of any dimensions, such as a whole 16x16 macroblock and/or a sub-macroblock
partition. Only one motion vector (MV) per sub-macroblock partition is allowed. The
motion can be estimated for each macroblock from different frames either in the past
or in the future, by associating the macroblock with the selected frame using the
macroblock's ref_idx.

A P-slice may also contain intra coded macroblocks. The intra coded
macroblocks within a P-slice are compressed in the same way as the intra coded
macroblocks in an I-slice. Inter coded blocks are predicted using motion estimation
and compensation strategies.

If all the macroblocks of an entire frame are encoded and transmitted using
intra mode, it is referred to as transmission of an "INTRA frame" (I-Frame or I-Picture).
An INTRA frame therefore consists entirely of intra macroblocks. Typically, an INTRA
frame must be transmitted at the start of an image transmission, when the receiver as
yet holds no received macroblocks. If a frame is encoded and transmitted by
encoding some or all of the macroblocks as inter macroblocks, then the frame is
referred to as an "INTER frame". Typically, an INTER frame comprises less data for
transmission than an INTRA frame. However, the encoder decides whether a
particular macroblock is transmitted as an intra coded macroblock or an inter coded
macroblock, depending on which is most efficient.

Every 16x16 macroblock to be inter coded in a P-slice may be partitioned into
16x8, 8x16, and 8x8 partitions. A sub-macroblock may itself be partitioned into 8x4,
4x8, or 4x4 sub-macroblock partitions. Each macroblock partition or sub-macroblock
partition in H.264 is assigned to a unique motion vector. Inter coded macroblocks
and macroblock partitions have unique prediction modes and reference indices. It is
not allowed in the current H.264 standard for inter and intra predictions to be selected
and mixed together in different partitions of the same macroblock. In the H.264/AVC
design adopted in February 2002, the partitioning scheme initially adopted from
Wiegand et al. included support of switching between intra and inter on a
sub-macroblock (8x8 luma with 4x4 chroma) basis. This capability was later removed in
order to reduce decoding complexity.

In P-pictures and P-slices, the following additional block types are defined:

Table 2 - Inter Macroblock types for P slices

mb_type   Name of mb_type  NumMbPart  MbPartPredMode  MbPartPredMode  MbPartWidth  MbPartHeight
                           (mb_type)  (mb_type, 0)    (mb_type, 1)    (mb_type)    (mb_type)
0         P_L0_16x16       1          Pred_L0         NA              16           16
1         P_L0_L0_16x8     2          Pred_L0         Pred_L0         16           8
2         P_L0_L0_8x16     2          Pred_L0         Pred_L0         8            16
3         P_8x8            4          NA              NA              8            8
4         P_8x8ref0        4          NA              NA              8            8
Inferred  P_Skip           1          Pred_L0         NA              16           16


FIG. 4 illustrates the combination of two (temporal) predictions for inter coding
a macroblock in a B-picture or B-slice.

As illustrated in FIG. 4, for a macroblock 410 to be inter coded within
B-pictures or B-slices, instead of using only one "Best Match" (BM) predictor
(prediction) for a current macroblock, two (temporal) predictions (BML0 and BML1)
are used for the current macroblock 410, which can be averaged together to form a
final prediction. In a B-picture or B-slice, up to two motion vectors (MVL0 and
MVL1), representing two estimates of the motion, per sub-macroblock partition are
allowed for temporal prediction. They can be from any reference pictures (List 0
Reference and List 1 Reference), subsequent or prior. The average of the pixel
values in the Best Matched blocks (BML0 and BML1) in the (List 0 and List 1)
reference pictures is used as the predictor. This standard also allows weighting the
pixel values of each Best Matched block (BML0 and BML1) unequally, instead of
averaging them. This is referred to as a Weighted Prediction mode and is useful in
the presence of special video effects, such as fading. A B-slice also has a special
mode, the Direct mode. The spatial methods used in MotionCopy skip mode and the
Direct mode are restricted only to the estimation of the motion parameters and not of
the macroblocks (pixels) themselves, and no spatially adjacent samples are used.
In Direct mode the motion vectors for a macroblock are not explicitly sent.
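
For illustration, the averaging of the two best-matched blocks and its weighted variant may be sketched as follows. This is a simplified sketch under stated assumptions: blocks are lists of rows of integer samples, the weights are small non-negative integers, and rounding is to the nearest integer. It does not reproduce the exact weighted prediction arithmetic of the H.264 standard, and the function name `bi_predict` is hypothetical.

```python
def bi_predict(bm_l0, bm_l1, w0=1, w1=1):
    """Combine two best-matched blocks sample by sample.

    With the default weights this is a rounded average; unequal weights
    give a simple weighted prediction (e.g. for fades).
    """
    total = w0 + w1
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(bm_l0, bm_l1)
    ]
```

With equal weights the result is the plain average used as the predictor; weighting one block more heavily biases the prediction toward that reference picture.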

The following macroblock types are defined for use in B-pictures and B-slices:

Table 3 - Inter Macroblock types for B slices

mb_type  Name of mb_type  NumMbPart  MbPartPredMode  MbPartPredMode  MbPartWidth  MbPartHeight
                          (mb_type)  (mb_type, 0)    (mb_type, 1)    (mb_type)    (mb_type)
0        B_Direct_16x16   NA         Direct          NA              8            8
1        B_L0_16x16       1          Pred_L0         NA              16           16
2        B_L1_16x16       1          Pred_L1         NA              16           16
3        B_Bi_16x16       1          BiPred          NA              16           16
4        B_L0_L0_16x8     2          Pred_L0         Pred_L0         16           8
5        B_L0_L0_8x16     2          Pred_L0         Pred_L0         8            16
6        B_L1_L1_16x8     2          Pred_L1         Pred_L1         16           8
7        B_L1_L1_8x16     2          Pred_L1         Pred_L1         8            16
8        B_L0_L1_16x8     2          Pred_L0         Pred_L1         16           8
9        B_L0_L1_8x16     2          Pred_L0         Pred_L1         8            16
10       B_L1_L0_16x8     2          Pred_L1         Pred_L0         16           8
11       B_L1_L0_8x16     2          Pred_L1         Pred_L0         8            16
12       B_L0_Bi_16x8     2          Pred_L0         BiPred          16           8
13       B_L0_Bi_8x16     2          Pred_L0         BiPred          8            16
14       B_L1_Bi_16x8     2          Pred_L1         BiPred          16           8


In B-slices, as shown in the above table, the two temporal predictions are
always restricted to using the same block type.

Deblocking filters and Overlapped Block Motion Compensation (OBMC) use
some spatial correlation. According to these methods, the reconstructed pixels,
after prediction and the addition of the associated residual, are spatially
filtered depending upon their mode (intra or inter), position (MB/block
edges, internal pixels, etc.), motion information, associated residual, and the
surrounding pixel values. This process can considerably reduce blocking



WO 2005/022920



PCT/US2004/027434 



artifacts and improve quality, but on the other hand can also increase complexity
considerably (especially within the decoder). This process also may not always
yield the best results and it may itself introduce additional blurring on the edges.

SUMMARY OF THE INVENTION

Existing video compression standards (e.g., MPEG-2 and H.264) do not allow
both intraframe (intra) and interframe (inter) predictions to be combined together (like
the combination of two interframe predictions in the inter-only bi-prediction) for
encoding a current macroblock or a subblock. In accordance with the principles of
the present invention, provision is made for the combination of intra predictions and
inter predictions in the encoding and decoding of a given macroblock, subblock, or
partition. The combination of intra and inter predictions enables improved gain
and/or encoding efficiency and/or might further reduce video data error propagation.

An embodiment of the invention provides for video encoding a block by
combining a first prediction of a current block with a second prediction of a current
block; wherein the first prediction of the current block is intra prediction and the
second prediction of the current block is inter prediction.

Throughout the following description it will be assumed that the luminance
(luma) component of a macroblock comprises 16x16 pixels arranged as an array of 4
8x8 blocks, and that the associated chrominance components are spatially
sub-sampled by a factor of two in the horizontal and vertical directions to form 8x8 blocks.
Extension of the description to other block sizes and other sub-sampling schemes will
be apparent to those of ordinary skill in the art. The invention is not limited by the
16x16 macroblock structure but can be used in any segmentation based video coding
system.

BRIEF DESCRIPTION OF THE DRAWINGS

The above features of the present invention will become more apparent by
describing in detail exemplary embodiments thereof with reference to the attached
drawings in which:

FIG. 1 depicts the samples near a 4x4 pixel luma block to be intra coded, in
accordance with the H.264 standard;

FIG. 2 illustrates the nine directions of prediction encoding for the 4x4 block
of FIG. 1, in accordance with the H.264 standard;

FIG. 3 depicts a macroblock being inter coded by estimating a motion vector,
in accordance with the H.264 standard;

FIG. 4 illustrates the bi-prediction of a macroblock by combining two inter
codings, in accordance with the H.264 standard;

FIG. 5 depicts intra-inter hybrid bi-prediction of a 4x4 block combining inter
and intra prediction, in accordance with the principles of the invention;

FIG. 6 is a block diagram illustrating a video encoder and a video decoder, in
accordance with the principles of the invention;

FIG. 7 is a block diagram illustrating a video encoder, in accordance with the
principles of the invention;

FIG. 8 is a block diagram illustrating a video decoder, in accordance with the
principles of the invention; and

FIGS. 9A and 9B are block diagrams illustrating circuits for combining intra
and inter predictions in the encoder of FIG. 7 or the decoder of FIG. 8.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 5 depicts an example of hybrid intra-inter bi-prediction where the same
4x4 block is predicted using inter and intra prediction. FIG. 5 illustrates a new
bi-prediction mode type, herein called the intra-inter hybrid coding mode, distinguished
from the intra-only (FIGs. 1 and 2) and inter-only (FIGs. 3 and 4) prediction modes of
the related art, which, unlike the related art, can combine both spatial ("A" through "M",
301) and temporal (MV, 302) predictions to bi-predictively encode the current
macroblock or current subblock 110. This new bi-predictive (or multi-predictive) mode
provides that two (or more) predictions, which may include one or more intra
predictions, are to be used (combined) for making the final prediction of a given block
or macroblock. Bi-prediction may be used also in I-pictures, with combined intra-intra
predictions. These two intra predictions could use two different intra prediction
directions.

The disclosed hybrid bi-predictive coding mode allows both intraframe (intra)
and interframe (inter) predictions to be combined together (e.g., averaged, or
weighted) for encoding a current macroblock, sub-macroblock, or partition. In
accordance with the principles of the present invention, the related art's method of
combining predictions (bi-prediction or multi-prediction) is extended by providing for
the combination of intra predictions and inter predictions for the encoding of a given
macroblock, sub-macroblock, or partition. The combination of intra and inter
predictions allows for improved gain and/or encoding efficiency and/or may reduce
video data error propagation.
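
A minimal sketch of how an intra prediction and an inter prediction of the same block might be combined, and how an encoder might compare the hybrid prediction against the intra-only and inter-only alternatives, is given below. The sketch assumes blocks are lists of rows of integer samples, that the combination is a rounded (optionally weighted) average, and that the comparison uses the sum of absolute differences; the names `combine`, `block_sad` and `choose_mode` are hypothetical, and an actual encoder could use a different combination rule or a rate-distortion criterion.

```python
def combine(intra_pred, inter_pred, w_intra=1, w_inter=1):
    """Weighted, rounded combination of an intra and an inter prediction."""
    total = w_intra + w_inter
    return [
        [(w_intra * a + w_inter * b + total // 2) // total
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(intra_pred, inter_pred)
    ]


def block_sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def choose_mode(cur, intra_pred, inter_pred):
    """Pick the candidate prediction with the lowest SAD against `cur`."""
    candidates = {
        "intra": intra_pred,
        "inter": inter_pred,
        "hybrid": combine(intra_pred, inter_pred),
    }
    return min(candidates, key=lambda name: block_sad(cur, candidates[name]))
```

In the example exercised by the test, the intra prediction undershoots and the inter prediction overshoots each sample by the same amount, so their average matches the current block exactly and the hybrid candidate is selected.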

Embodiments of the invention provide several new macroblock modes/types
that can be integrated within a video encoder and decoder and could further improve
performance as compared with existing architectures. The new macroblock modes
are similar to the bi-predictive (or multi-predictive) macroblock modes already used in
several encoding architectures and standards such as MPEG-2 and H.264, in the
sense that they use two or more predictions for each macroblock or sub-block, but
they differ in the sense that they can also use (or only use) intraframe (spatial)
prediction, as contrasted with conventional inter-only (temporal) bi-prediction. It is
possible, for example, that the combination of two different intra predictions or the
combined usage of inter and intra predictions would give a better prediction for a
given macroblock, while it could also be beneficial in reducing blocking artifacts given
that adjacent spatial samples may be considered during the performance of the
disclosed bi-prediction coding method. The disclosed method of combining intra
and inter predictions for coding the same macroblock or subblock can lead to higher
performance because a) either prediction may contain important distinct information
that is not preserved if only a single prediction is used, b) either picture may contain
different encoding artifacts that can be reduced through averaging or weighting, and c)
averaging functions as a noise reduction mechanism, etc.

Additionally, the disclosed bi-predictive (or multi-predictive) macroblock
coding mode supports inter prediction modes that are not constrained to use the
same partition types, and allows the use of all possible combinations of intra and
single-list inter types that are defined in Tables 1 through 3. The disclosed
bi-predictive (or multi-predictive) macroblock coding mode supports inter and intra
predictions to be performed based upon different partitions of the same macroblock
to be coded. For example, if only up to two (bi) predictions per macroblock are
allowed for a hybrid intra-inter coded macroblock, the first prediction could be
intra4x4 (mb_type 0 in Table 1) while the second prediction could be the 16x8 list 1
block prediction (mb_type 6 in Table 3).

Syntax and submode types

The prediction type(s) employed in hybrid-encoding each macroblock are
signaled within the bitstream, either in a combined form (like the form used for B
slices in H.264) or separately (i.e., using a tree structure). Optionally, the number of
predictions (e.g., 1, 2 or more) could also be signaled within the bitstream. The
combined signaling method employed in the related art would necessitate the
enumeration of all possible or the most likely combinations of prediction type(s) and
may not result in the highest compression gain. The compression gain can be
optimized by employing a separate tree-structured architecture, signaling separately
each prediction mode. This method allows the use of all possible combinations of
intra and single-list inter types that are defined in Tables 1 through 3, while keeping
syntax simple. For example, if only up to two (bi) predictions per macroblock are
allowed for a hybrid-coded macroblock, the first prediction could be intra4x4 (mb_type
0 in Table 1) while the second prediction could be the 16x8 list 1 block prediction
(mb_type 6 in Table 3). For these additional submodes their associated parameters
also need to be transmitted, such as the intra direction and/or the associated
reference indices and motion vectors. This approach allows various combinations,
such as both/all predictions being intra but having different directions, or being
different-list predictions, or using different block partitions.
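
Purely as an illustration of the separate (tree-structured) signaling described above, the following sketch lists the syntax elements that might be emitted for a macroblock: first the number of predictions, then, for each prediction, its type followed by the parameters that type requires. All element names (`num_predictions`, `pred_type`, `intra_mode`, `ref_idx`, `mv`) and the function name `signal_predictions` are hypothetical; the sketch does not define an actual bitstream syntax or any entropy coding.

```python
def signal_predictions(predictions):
    """Return an ordered list of (syntax_element, value) pairs.

    Each prediction is a dict with a "type" of "intra" or "inter".
    Intra predictions carry an "intra_mode"; inter predictions carry a
    "ref_idx" and a motion vector "mv".
    """
    elements = [("num_predictions", len(predictions))]
    for i, pred in enumerate(predictions):
        elements.append((f"pred_type[{i}]", pred["type"]))
        if pred["type"] == "intra":
            # Intra branch of the tree: only the direction/mode is needed.
            elements.append((f"intra_mode[{i}]", pred["intra_mode"]))
        else:
            # Inter branch of the tree: reference index and motion vector.
            elements.append((f"ref_idx[{i}]", pred["ref_idx"]))
            elements.append((f"mv[{i}]", pred["mv"]))
    return elements
```

Because each prediction is described independently, any combination of intra and single-list inter predictions can be expressed without enumerating the combinations in a single table.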

It may also be preferable to make adjustments and extensions to the
submodes provided in the H.264 standard, since (a) some combinations are identical,
and (b) it may be desirable in some cases to use a single prediction for a macroblock.
For example, for case (a), we may disallow identical prediction modes and
automatically adjust the submode types, while for case (b) we can introduce the
following additional modes, which define a new Null block prediction type that implies
no prediction:

Name of mb_type  NumMbPart  MbPartPredMode  MbPartPredMode  MbPartWidth  MbPartHeight
                 (mb_type)  (mb_type, 0)    (mb_type, 1)    (mb_type)    (mb_type)
B_Null_L0_16x8   2          Null            Pred_L0         16           8
B_Null_L0_8x16   2          Null            Pred_L0         8            16
B_Null_L1_16x8   2          Null            Pred_L1         16           8
B_Null_L1_8x16   2          Null            Pred_L1         8            16
B_L0_Null_16x8   2          Pred_L0         Null            16           8
B_L0_Null_8x16   2          Pred_L0         Null            8            16
B_L1_Null_16x8   2          Pred_L1         Null            16           8
B_L1_Null_8x16   2          Pred_L1         Null            8            16




A combination of two null prediction block types for the same sub-partition (e.g.,
B_Null_L0_8x16 and B_Null_L1_8x16) is forbidden, which again implies that
adapting the related art's mb_type table would provide further advantages. A similar
extension could be made for 8x8 subblocks/partitions. Because all bi-predictive
modes defined in Table 3 can be supported by the disclosed hybrid mode, they could
be eliminated as redundant.
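
As a small illustrative sketch, the Null macroblock type names listed above can be enumerated by pairing a Null prediction with a single-list prediction in either order, for each two-partition shape, while excluding the Null/Null pairing. The function name `null_mb_types` and the constants are hypothetical helpers introduced only for this sketch.

```python
from itertools import product

LISTS = ("L0", "L1")
SHAPES = ("16x8", "8x16")


def null_mb_types():
    """Enumerate the names of two-partition types with exactly one Null part."""
    names = []
    for first, second in product(("Null",) + LISTS, repeat=2):
        # Keep only pairings in which exactly one prediction is Null:
        # Null/Null is excluded, and pairs without Null are ordinary types.
        if (first == "Null") != (second == "Null"):
            names.extend(f"B_{first}_{second}_{shape}" for shape in SHAPES)
    return names
```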

Extending Direct mode with hybrid intra-inter bi-prediction

The spatial Direct mode used in H.264 can be extended with a hybrid
intra-inter bi-prediction mode embodiment of the invention. Currently, the motion vectors
of a Direct mode block are determined based on the median of the motion vectors of
the neighboring blocks. If the prediction mode of at least one neighbor is hybrid
(inter-intra), then the prediction mode of the current (Direct mode) block could also be
hybrid (inter-intra). This method of prediction could be restricted according to the
availability of the prediction lists utilized by the neighboring blocks. For example, if
both lists are available in the spatial neighbors, then the motion vectors of a Direct
mode block are calculated again using median prediction regardless of whether one of
the neighbors utilizes hybrid prediction. On the other hand, if only one list is available
(e.g., list_1), while one of the neighbors utilizes hybrid prediction, then the Direct mode
block will be predicted with hybrid prediction as well, while also utilizing the same
available list. If more than one adjacent block utilizes hybrid prediction, then some
simple rules can be defined regarding the intra mode that is to be used for the
prediction. For example, Intra16x16 should for this case supersede Intra4x4 due to
simplicity for the prediction. Intra4x4 could also be disallowed within Direct mode
blocks, but if used then every 4x4 block direction may be predicted initially using the
external macroblock predictions if available. If no external Intra4x4 macroblock is
available, then the lowest intra mb_type available from the adjacent blocks is used. In
general, if the prediction mode of more than one of the spatially neighboring blocks is
intra, then the lowest order intra prediction is preferably used, while intra is preferably
not used if both lists are available. Intra is also not used if the samples required for the
prediction are not available.
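
The decision rules above may be sketched, in simplified form, as follows. The sketch assumes exactly three spatial neighbors, takes the component-wise median of their motion vectors, enables hybrid prediction only when a single list is available and at least one neighbor is hybrid, and then picks the lowest intra mb_type among the hybrid neighbors. The data layout and the names `median3` and `direct_mode_decision` are hypothetical, and the handling of Intra4x4 directions and of unavailable samples is omitted.

```python
def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]


def direct_mode_decision(neighbors, lists_available):
    """Derive Direct mode parameters from three neighboring blocks.

    neighbors: three dicts with "mv" (dx, dy), "hybrid" (bool) and
               "intra_mb_type" (int for hybrid neighbors, else None).
    lists_available: set of available prediction lists, e.g. {"L0", "L1"}.
    """
    mv = (
        median3(*(n["mv"][0] for n in neighbors)),
        median3(*(n["mv"][1] for n in neighbors)),
    )
    hybrid_neighbors = [n for n in neighbors if n["hybrid"]]
    # Hybrid prediction is inherited only when a single list is available.
    use_hybrid = len(lists_available) == 1 and bool(hybrid_neighbors)
    intra_type = (
        min(n["intra_mb_type"] for n in hybrid_neighbors) if use_hybrid else None
    )
    return {"mv": mv, "hybrid": use_hybrid, "intra_mb_type": intra_type}
```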

In-Loop Deblocking Filter

Video encoders in accordance with the H.264/AVC standard may employ an
in-loop deblocking filter to increase the correlation between adjacent pixels mainly at
block and macroblock (MB) edges and to reduce the blockiness (blocking artifacts)
introduced in a decoded picture. The filtered decoded pictures are used to predict the
motion for other pictures. The deblocking filter is an adaptive filter that adjusts its
strength depending upon the compression mode of a macroblock (intra or inter), the
quantization parameter, motion vector, frame or field coding decision and the pixel
values. For smaller quantization sizes the filter shuts itself off. This filter can also be
shut off explicitly by an encoder at the slice level. Different strengths and methods
for the deblocking filter are employed depending on the adjacent block coding types,
motion and transmitted residual. By taking advantage of the features of an
embodiment of the current invention, since the hybrid intra-inter macroblock type
already contains an intra predictor that considers adjacent pixels during its encoding,
the strength of the deblocking filter may be modified accordingly. For example, if a
block is hybrid coded and its prediction mode uses two (or more) predictions, of
which one is intra, and has additional coefficients, then the filter strength for the
corresponding edges can be reduced (e.g., by one).
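
The strength adjustment in the example above may be sketched as follows, assuming an integer filter strength in which zero means that the edge is not filtered. The function name `adjusted_filter_strength` and its parameters are hypothetical, and the derivation of the base strength itself is outside the scope of the sketch.

```python
def adjusted_filter_strength(base_strength, is_hybrid, has_intra_prediction,
                             has_coefficients):
    """Reduce the deblocking strength by one for qualifying hybrid blocks.

    The reduction applies only when the block is hybrid coded, one of its
    predictions is intra, and it carries additional (residual) coefficients.
    The result is never allowed to drop below zero.
    """
    if is_hybrid and has_intra_prediction and has_coefficients:
        return max(base_strength - 1, 0)
    return base_strength
```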

FIG. 6 shows a block schematic diagram illustrating a video encoder 604 and
a video decoder 605 in accordance with an embodiment of the present invention.
Encoder 604 receives data about an image sequence from a video source 602, e.g.,
a camera. After compressing and encoding the data from the camera in accordance
with the methods herein disclosed, the encoder 604 passes the information to a
transmission system 606. Transmitter 606 transmits the bitstream containing the
hybrid-encoded macroblocks over a channel medium 611. The transmission channel
medium 611 may be via wireless, cable, or any other transmission or routing scheme.
The circuitry of either encoder 604 or decoder 605 may, for example, form part of a
mobile or a portable two-way radio, or a mobile phone.

The moving image decoding apparatus (decoder 605) on the decoding side
of the channel medium 611 receives the bitstream in a bitstream buffer 631,
parses the bitstream (with bitstream parser-processor 633), decodes the encoded
image data from an intra coded frame A, and stores the decoded image data of the
frame A in a frame memory 635. Upon reception of inter encoded difference data, the
decoding apparatus (decoder 605) generates a motion compensation predicted
image from the decoded data of the frame A. If there is no error in the received data
of the frame A, since the decoded image of the frame A matches the local decoded
image of the frame A on the encoding apparatus (encoder 604) side, the motion
compensation predicted image generated from this decoded data matches the motion
compensation predicted image on the encoding apparatus (encoder 604) side. Since
the encoding apparatus (encoder 604) sends out the difference image between the
original image and the motion compensation predicted image, the decoding
apparatus can generate a decoded image of the frame B by adding the motion
compensation predicted image to the received difference image. If the data of the
frame A received by the decoding apparatus contains an error, a correct decoded
image of the frame A cannot be generated. As a consequence, all images generated
from the error-containing portion of the motion compensation predicted image
become erroneous data. These erroneous data remain until refresh processing is
performed by intra-encoding.

When the decoder 605 receives hybrid-encoded bi-predictive data, hybrid-
coded block data containing intra coded information can be employed at the decoder
to prevent or arrest the propagation of image data errors. Some portions or areas of,
or objects within, an image or sequence of images may be identifiable by the user or
by the encoding apparatus as being more important or more susceptible to errors
than other areas. Thus, the encoding apparatus (encoder 604) may be adapted to
selectively hybrid-encode the macroblocks in the more important areas or objects in a
frame sequence, so that image data error propagation in those areas is prevented or
reduced.

FIG. 7 depicts an exemplary encoder, according to an embodiment of the
invention, indicated generally by the reference numeral 700. The encoder 700
includes a video input terminal 712 that is coupled in signal communication to a
positive input of a summing block 714. The summing block 714 is coupled, in turn, to
a function block 716 for implementing an integer transform to provide coefficients.
The function block 716 is coupled to an entropy coding block 718 for implementing
entropy coding to provide an output bitstream. The function block 716 is further
coupled to an in-loop portion 720 at a scaling and inverse transform block 722. The
function block 722 is coupled to a summing block 724, which, in turn, is coupled to an
intra-frame prediction block 726. The intra-frame prediction block 726 is coupled to a
first input of a combining unit 727, the output of which is coupled to a second input of
the summing block 724 and to an inverting input of the summing block 714.

The output of the summing block 724 is coupled to a deblocking filter 740.
The deblocking filter 740 is coupled to a frame store 728. The frame store 728 is
coupled to a motion compensation (inter-frame prediction) block 730, which is
coupled to a second input of the combining unit 727.

The combining unit 727 combines a first (intra) prediction, from the intra-
frame prediction block 726, with a second (inter) prediction from the motion
compensation (inter-frame prediction) block 730, to output a resulting combined
(hybrid intra-inter) prediction to the second input of the summing block 724 and to an
inverting input of the summing block 714. In some embodiments of the invention, the
combining unit 727 may be implemented as a summing block (e.g., similar to
summing block 724 or 714) operatively coupled to one or more gain blocks (see FIG.
9A) to produce either an "average" of the input intra and inter predictions, or a
differently weighted combination of the intra and inter predictions. In other
embodiments of the invention, the combining unit 727 may be implemented as a
sequential adder circuit adapted to combine a first intra prediction from the intra-
frame prediction block 726 with a second (e.g., subsequent) intra prediction from the
intra-frame prediction block 726; and further adapted to combine a first (intra)
prediction from the intra-frame prediction block 726 with a second (inter) prediction
from the motion compensation (inter-frame prediction) block 730.

The video input terminal 712 is further coupled to a motion estimation block
719 to provide motion vectors. The deblocking filter 740 is coupled to a second
input of the motion estimation (inter-frame prediction) block 719. The output of the
motion estimation block 719 is coupled to the motion compensation (inter-frame
prediction) block 730 as well as to a second input of the entropy coding block 718.

The video input terminal 712 is further coupled to a coder control block 760.
The coder control block 760 is coupled to control inputs of each of the blocks 716,
718, 719, 722, 726, 730, and 740 for providing control signals to control the operation
of the encoder 700.
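
The signal path of FIG. 7 around the combining unit 727 can be sketched as
follows. This is only an illustrative sketch under stated assumptions: a block is a
flat list of pixel values, the transform, scaling, and entropy coding stages are
treated as lossless and omitted, and the combining unit applies equal weights. The
function names and sample values are hypothetical.

```python
def combine_predictions(intra, inter, w_intra=0.5, w_inter=0.5):
    """Combining unit 727: weighted sum of the intra and inter predictions."""
    return [w_intra * a + w_inter * b for a, b in zip(intra, inter)]


def encode_block(source, intra, inter):
    """Return the residual sent onward and the locally reconstructed block."""
    hybrid = combine_predictions(intra, inter)
    # Summing block 714: source on the positive input, prediction inverted.
    residual = [s - h for s, h in zip(source, hybrid)]
    # Summing block 724: (lossless) residual plus the same hybrid prediction.
    reconstructed = [r + h for r, h in zip(residual, hybrid)]
    return residual, reconstructed


source = [100, 104, 98, 102]
intra = [96, 100, 100, 100]    # hypothetical intra prediction
inter = [104, 104, 96, 100]    # hypothetical motion compensated prediction

residual, reconstructed = encode_block(source, intra, inter)
```

Because both summing blocks receive the same hybrid prediction, the local
reconstruction equals the source whenever the residual path is lossless, which is the
property that keeps the encoder and decoder references aligned.
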

FIG. 8 depicts an exemplary decoder, according to an embodiment of the
invention, indicated generally by the reference numeral 800. The decoder 800
includes an entropy decoding block 810 for receiving an input bitstream. The
decoding block 810 is coupled for providing coefficients to an in-loop portion 820 at a
scaling and inverse transform block 822. The inverse transform block 822 is
coupled to a summing block 824, which, in turn, is coupled to an intra-frame
prediction block 826. The intra-frame prediction block 826 is coupled to a first input
of a combining unit 827, the output of which is coupled to a second input of the
summing block 824.

The output of the summing block 824 is coupled to a deblocking filter 840 for
providing output images. The deblocking filter 840 is coupled to a frame store 828.
The frame store 828 is coupled to a motion compensation (inter-frame prediction)
block 830, which is coupled to a second input of the combining unit 827. The
decoding block 810 is further coupled for providing motion vectors to a second input
of the motion compensation (inter-frame prediction) block 830.
The decoder combining unit 827 is similar in function to the combining unit
727 in the encoder of FIG. 7 in that it combines a first (intra) prediction, from the intra-
frame prediction block 826, with a second (inter) prediction from the motion
compensation (inter-frame prediction) block 830, to output a resulting combined
(hybrid intra-inter) prediction to the second input of the summing block 824. In some
embodiments of the invention, the combining unit 827 may be implemented as a
summing block (e.g., similar to summing block 824) operatively coupled to one or
more gain blocks (see FIG. 9A), to produce either an "average" of the input intra and
inter predictions, or a differently weighted combination of the intra and inter
predictions. In other embodiments of the invention, the combining unit 827 may be
implemented as a sequential adder circuit adapted to combine a first intra prediction
from the intra-frame prediction block 826 with a second (e.g., subsequent) intra
prediction from the intra-frame prediction block 826; and further adapted to combine
a first (intra) prediction from the intra-frame prediction block 826 with a second (inter)
prediction from the motion compensation (inter-frame prediction) block 830.

The entropy decoding block 810 is further coupled for providing input to a
decoder control block 862. The decoder control block 862 is coupled to control
inputs of each of the blocks 822, 826, 830, and 840 for communicating control signals
and controlling the operation of the decoder 800.
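
The decoder path of FIG. 8 through the combining unit 827 and the summing
block 824 can be sketched in the same simplified way. The sketch assumes equal
weights, a lossless residual, and an intra prediction that is itself error-free; under
those assumptions an error of 8 in the inter reference appears as an error of only 4
in the decoded block. The function name and sample values are hypothetical.

```python
def decode_block(residual, intra, inter, w_intra=0.5, w_inter=0.5):
    """Combine the two predictions, then add the decoded residual."""
    # Combining unit 827: weighted sum of the intra and inter predictions.
    hybrid = [w_intra * a + w_inter * b for a, b in zip(intra, inter)]
    # Summing block 824: residual plus the hybrid prediction.
    return [r + h for r, h in zip(residual, hybrid)]


residual = [0.0, 2.0, 0.0, 2.0]
intra = [96, 100, 100, 100]     # hypothetical intra prediction
inter = [104, 104, 96, 100]     # hypothetical error-free inter prediction

correct = decode_block(residual, intra, inter)

# Simulate a drifted reference frame: every inter sample is off by 8.
drifted_inter = [value + 8 for value in inter]
drifted = decode_block(residual, intra, drifted_inter)

# Error that actually reaches the decoded block.
drift = [d - c for d, c in zip(drifted, correct)]
```

A purely inter coded block would carry the full offset forward, so the intra
contribution is what attenuates the propagated error in this sketch.
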

FIGS. 9A and 9B are block schematic diagrams each illustrating an
exemplary embodiment of the combining unit (e.g., 727 or 827) in the encoder of FIG.
7 or in the decoder of FIG. 8, comprising circuits for additively combining a first and
second prediction, e.g., additively combining an intra and an inter prediction. FIG.
9A depicts an exemplary combining unit x27-a (e.g., for implementing combining units
727 and 827) including an adder circuit (denoted by the Sigma signal) A27 being
adapted to combine a first intra prediction from the coupled intra-frame prediction
block (e.g., 726 or 826) with a second (e.g., subsequent) intra prediction from the
intra-frame prediction block (e.g., 726 or 826); and being further adapted to combine
a first (intra) prediction from the intra-frame prediction block (e.g., 726 or 826) with a
second (inter) prediction from the motion compensation (inter-frame prediction) block
(e.g., 730 or 830). Digital gain blocks G1, G2, and G3 provide for weighting (or
simply averaging) of the plurality (e.g., two) of predictions to be combined. Persons
skilled in the art will recognize that in alternative embodiments of the invention, fewer
than three (e.g., one or two) digital gain blocks may be provided to combine two
predictions, for example, as depicted in FIG. 9B.
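
The weighting performed by the gain blocks can be sketched as below. The text
does not state how G1, G2, and G3 are wired, so the arrangement here is an
assumption made only for illustration: G1 and G2 scale the two inputs of the adder
A27, and G3 scales its output. The function name and sample values are hypothetical.

```python
def combining_unit(pred_1, pred_2, g1=1.0, g2=1.0, g3=0.5):
    """Assumed wiring: output = G3 * (G1 * pred_1 + G2 * pred_2)."""
    return [g3 * (g1 * a + g2 * b) for a, b in zip(pred_1, pred_2)]


intra = [96, 100]    # hypothetical intra prediction samples
inter = [104, 104]   # hypothetical inter prediction samples

# Default gains give a plain average of the two predictions.
average = combining_unit(intra, inter)

# Unequal input gains give a 3:1 weighting in favour of the intra prediction.
weighted = combining_unit(intra, inter, g1=3.0, g2=1.0, g3=0.25)
```

With unit input gains only the output gain remains, which corresponds to the
reduced arrangement with fewer gain blocks mentioned above. The same function
also covers combining two intra predictions, since it does not depend on where
either input comes from.
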

FIG. 9B depicts an exemplary combining unit x27-b (e.g., 727 and 827)
including an adder circuit (denoted by the Sigma signal) A27 being adapted to average
a first intra prediction from the intra-frame prediction block (e.g., 726 or 826) with a
second (e.g., subsequent) intra prediction from the intra-frame prediction block (e.g.,
726 or 826); and being further adapted to average a first (intra) prediction from the
intra-frame prediction block (e.g., 726 or 826) with a second (inter) prediction from the
motion compensation (inter-frame prediction) block (e.g., 730 or 830). A digital gain
block is applied to the sum of two predictions output from the adder circuit (denoted
by the Sigma signal).

Various aspects of the present invention can be implemented in software,
which may be run in a general purpose computer or any other suitable computing
environment. The present invention is operable in a number of general purpose or
special purpose computing environments such as personal computers, general
purpose computers, server computers, hand-held devices, laptop devices,
multiprocessors, microprocessors, set top boxes, programmable consumer
electronics, network PCs, minicomputers, mainframe computers, distributed
computing environments and the like to execute computer-executable instructions
for performing a frame-to-frame digital video encoding of the present invention which
is stored on a computer readable medium. The present invention may be implemented
in part or in whole as computer-executable instructions, such as program modules
that are executed by a computer. Generally, program modules include routines,
programs, objects, components, data structures and the like to perform particular
tasks or to implement particular abstract data types. In a distributed computing
environment, program modules may be located in local or remote storage devices.

Exemplary embodiments of the invention have been explained above and are
shown in the figures. However, the present invention is not limited to the exemplary
embodiments described above, and it is apparent that variations and modifications
can be effected by those skilled in the art within the spirit and scope of the present
invention. Therefore, the exemplary embodiments should be understood not as
limitations but as examples. The scope of the present invention is not determined
by the above description but by the accompanying claims, and variations and
modifications may be made to the embodiments of the invention without departing
from the scope of the invention as defined by the appended claims and equivalents.



